
# Efficient Vision-Language Model

Openfly Agent 7b
MIT
OpenFly is a platform for aerial vision-language navigation that provides a multi-functional toolchain and a large-scale benchmark.
Multimodal Fusion Transformers English
IPEC-COMMUNITY
234
0
Xgen Mm Vid Phi3 Mini R V1.5 128tokens 8frames
xGen-MM-Vid (BLIP-3-Video) is an efficient, compact vision-language model with an explicit temporal encoder, designed for video understanding.
Video-to-Text Safetensors English
Salesforce
398
11
Nanollava
Apache-2.0
nanoLLaVA is a 1B-parameter vision-language model designed to run efficiently on edge devices.
Text-to-Image Transformers English
qnguyen3
2,851
154
© 2025 AIbase